
    Newton slopes for Artin-Schreier-Witt towers

    We fix a monic polynomial $f(x) \in \mathbb{F}_q[x]$ over a finite field and consider the Artin-Schreier-Witt tower defined by $f(x)$; this is a tower of curves $\cdots \to C_m \to C_{m-1} \to \cdots \to C_0 = \mathbb{A}^1$, with total Galois group $\mathbb{Z}_p$. We study the Newton slopes of zeta functions of this tower of curves. This reduces to the study of the Newton slopes of L-functions associated to characters of the Galois group of this tower. We prove that, when the conductor of the character is large enough, the Newton slopes of the L-function form arithmetic progressions which are independent of the conductor of the character. As a corollary, we obtain a result on the behavior of the slopes of the eigencurve associated to the Artin-Schreier-Witt tower, analogous to the result of Buzzard and Kilford.
    Comment: 15 pages; relative to the refereed version (to appear in Math. Ann.), we fixed two minor errors, one in the proof of Theorem 3.8, the other for Theorem 4.
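    To unpack the main statement: "the Newton slopes form arithmetic progressions independent of the conductor" means, schematically (an illustration of the shape of the result only; the number of progressions and the values of the initial slopes and common difference are determined by $f$ and $p$ in the paper and are not reproduced here):
    \[
      \{\text{slopes of the } L\text{-function}\} \;=\; \{\, s_i + k\,\delta \;:\; 1 \le i \le d,\ k = 0, 1, 2, \dots \,\},
    \]
    where the finitely many initial slopes $s_1, \dots, s_d$ and the common difference $\delta$ do not depend on the character, provided its conductor is sufficiently large.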

    Masked Language Model Scoring

    Pretrained masked language models (MLMs) require finetuning for most NLP tasks. Instead, we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are computed by masking tokens one by one. We show that PLLs outperform scores from autoregressive language models like GPT-2 in a variety of tasks. By rescoring ASR and NMT hypotheses, RoBERTa reduces an end-to-end LibriSpeech model's WER by 30% relative and adds up to +1.7 BLEU on state-of-the-art baselines for low-resource translation pairs, with further gains from domain adaptation. We attribute this success to PLL's unsupervised expression of linguistic acceptability without a left-to-right bias, greatly improving on scores from GPT-2 (+10 points on island effects, NPI licensing in BLiMP). One can finetune MLMs to give scores without masking, enabling computation in a single inference pass. In all, PLLs and their associated pseudo-perplexities (PPPLs) enable plug-and-play use of the growing number of pretrained MLMs; e.g., we use a single cross-lingual model to rescore translations in multiple languages. We release our library for language model scoring at https://github.com/awslabs/mlm-scoring.
    Comment: ACL 2020 camera-ready (presented July 2020)
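    A minimal sketch of the mask-one-token-at-a-time PLL computation described above, assuming the HuggingFace transformers library and a RoBERTa checkpoint; this illustrates the idea and is not the authors' mlm-scoring implementation (the model name and the helper function are placeholders):

    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModelForMaskedLM.from_pretrained("roberta-base").eval()

    def pseudo_log_likelihood(sentence):
        ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
        total = 0.0
        for i in range(1, len(ids) - 1):         # skip the <s> and </s> specials
            masked = ids.clone()
            masked[i] = tokenizer.mask_token_id  # mask exactly one token
            with torch.no_grad():
                logits = model(masked.unsqueeze(0)).logits[0, i]
            total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
        return total

    Summing over the N scored tokens gives the PLL, and the associated pseudo-perplexity is its per-token exponentiated negative, exp(-PLL/N). Note the sketch costs one forward pass per token, which is exactly the overhead the paper's finetuned no-masking variant removes.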

    Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity

    There is a recently discovered and intriguing phenomenon called Neural Collapse: at the terminal phase of training a deep neural network for classification, the within-class penultimate feature means and the associated classifier vectors of all flat classes collapse to the vertices of a simplex Equiangular Tight Frame (ETF). Recent work has tried to exploit this phenomenon by fixing the classifier weights to a pre-computed ETF to induce neural collapse and maximize the separation of the learned features when training with imbalanced data. In this work, we propose to fix the linear classifier of a deep neural network to a Hierarchy-Aware Frame (HAFrame) instead of an ETF, and to use a cosine-similarity-based auxiliary loss to learn hierarchy-aware penultimate features that collapse to the HAFrame. We demonstrate that our approach reduces the mistake severity of the model's predictions while maintaining its top-1 accuracy on several datasets of varying scales with hierarchies of heights ranging from 3 to 12. We will release our code on GitHub in the near future.
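    A minimal sketch of the general recipe described above, assuming PyTorch: the random orthonormal frame below is only a stand-in for the paper's pre-computed HAFrame (or a simplex ETF), whose hierarchy-aware construction is not reproduced here, and alpha is a hypothetical weight for the auxiliary term.

    import torch
    import torch.nn.functional as F

    num_classes, feat_dim = 10, 64

    # Stand-in frame: unit-norm class vectors as rows. The real HAFrame
    # encodes the label hierarchy; a simplex ETF is the usual alternative.
    frame = torch.linalg.qr(torch.randn(feat_dim, num_classes)).Q.T  # (C, D)

    classifier = torch.nn.Linear(feat_dim, num_classes, bias=False)
    classifier.weight.data.copy_(frame)
    classifier.weight.requires_grad_(False)  # the classifier stays fixed

    def loss_fn(features, labels, alpha=1.0):
        ce = F.cross_entropy(classifier(features), labels)
        # Auxiliary term: pull each penultimate feature toward its own class
        # vector so the learned features collapse onto the fixed frame.
        cos = F.cosine_similarity(features, frame[labels], dim=-1)
        return ce + alpha * (1.0 - cos).mean()

    During training only the backbone that produces the features receives gradients; the frozen classifier simply projects features onto the fixed class vectors.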

    Latest Cosmological Constraints on Cardassian expansion models including the updated Gamma-ray bursts

    In this paper, we constrain Cardassian expansion models with the latest observations, including updated Gamma-ray bursts (GRBs) calibrated cosmology-independently from the Union2 compilation of type Ia supernovae (SNe Ia). By combining the GRB data with the Union2 SNe Ia set, the Cosmic Microwave Background radiation observation from the seven-year Wilkinson Microwave Anisotropy Probe, and the baryonic acoustic oscillation observation from the spectroscopic Sloan Digital Sky Survey Data Release galaxy sample, we find significant constraints on the model parameters: $\Omega_{\rm M0}=0.282_{-0.014}^{+0.015}$ and $n = 0.03_{-0.05}^{+0.05}$ for the original Cardassian model; and $n = -0.16_{-3.26}^{+0.25}$ and $\beta = 0.76_{-0.58}^{+0.34}$ for the modified polytropic Cardassian model, which are consistent with the $\Lambda$CDM model within the 1-$\sigma$ confidence region. From the reconstruction of the deceleration parameter $q(z)$ in Cardassian models, we obtain the transition redshift $z_{\rm T}=0.73\pm0.04$ for the original Cardassian model and $z_{\rm T}=0.68\pm0.04$ for the modified polytropic Cardassian model.
    Comment: 11 pages, 5 figures, 1 table; accepted for publication in Res. Astron. Astrophys.
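    For context, the original Cardassian model (Freese & Lewis 2002) modifies the Friedmann equation by adding a power-law density term; in the standard form from the literature (the paper's own conventions may differ in detail):
    \[
      H^2 = \frac{8\pi G}{3}\,\rho + B\,\rho^n ,
    \]
    For $n = 0$ the extra term behaves as a cosmological constant, so the fitted $n = 0.03_{-0.05}^{+0.05}$ being consistent with zero is what makes the result consistent with $\Lambda$CDM.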